{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b3d48bf3",
   "metadata": {
    "papermill": {
     "duration": 0.003779,
     "end_time": "2025-02-19T13:31:01.100590",
     "exception": false,
     "start_time": "2025-02-19T13:31:01.096811",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "**[Machine Learning Course Home Page](https://www.kaggle.com/learn/machine-learning)**\n",
    "\n",
    "---\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1acc8a2f",
   "metadata": {
    "papermill": {
     "duration": 0.003088,
     "end_time": "2025-02-19T13:31:01.107333",
     "exception": false,
     "start_time": "2025-02-19T13:31:01.104245",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "This exercise will test your ability to read a data file and understand statistics about the data.\n",
    "\n",
    "In later exercises, you will apply techniques to filter the data, build a machine learning model, and iteratively improve your model.\n",
    "\n",
    "The course examples use data from Melbourne. To ensure you can apply these techniques on your own, you will have to apply them to a new dataset (with house prices from Iowa).\n",
    "\n",
    "The exercises use a \"notebook\" coding environment.  In case you are unfamiliar with notebooks, we have a [90-second intro video](https://www.youtube.com/watch?v=4C2qMnaIKL4).\n",
    "\n",
    "# Exercises\n",
    "\n",
    "Run the following cell to set up code-checking, which will verify your work as you go."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3324bb0b",
   "metadata": {
    "papermill": {
     "duration": 0.002859,
     "end_time": "2025-02-19T13:31:01.113571",
     "exception": false,
     "start_time": "2025-02-19T13:31:01.110712",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "86455b68",
   "metadata": {
    "papermill": {
     "duration": 0.002753,
     "end_time": "2025-02-19T13:31:01.119433",
     "exception": false,
     "start_time": "2025-02-19T13:31:01.116680",
     "status": "completed"
    },
    "tags": []
   },
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d39f71c9",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-02-19T13:31:01.127174Z",
     "iopub.status.busy": "2025-02-19T13:31:01.126677Z",
     "iopub.status.idle": "2025-02-19T13:31:02.374069Z",
     "shell.execute_reply": "2025-02-19T13:31:02.372738Z"
    },
    "papermill": {
     "duration": 1.254516,
     "end_time": "2025-02-19T13:31:02.376767",
     "exception": false,
     "start_time": "2025-02-19T13:31:01.122251",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Setup Complete\n"
     ]
    }
   ],
   "source": [
    "# Set up code checking\n",
    "from learntools.core import binder\n",
    "binder.bind(globals())\n",
    "from learntools.machine_learning.ex2 import *\n",
    "print(\"Setup Complete\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e801c222",
   "metadata": {
    "papermill": {
     "duration": 0.003088,
     "end_time": "2025-02-19T13:31:02.383630",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.380542",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## Step 1: Loading Data\n",
    "Read the Iowa data file into a Pandas DataFrame called `home_data`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "edf66de2",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-02-19T13:31:02.391084Z",
     "iopub.status.busy": "2025-02-19T13:31:02.390594Z",
     "iopub.status.idle": "2025-02-19T13:31:02.450877Z",
     "shell.execute_reply": "2025-02-19T13:31:02.449552Z"
    },
    "papermill": {
     "duration": 0.066945,
     "end_time": "2025-02-19T13:31:02.453712",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.386767",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.5, \"interactionType\": 1, \"questionType\": 1, \"questionId\": \"1_LoadHomeData\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Path of the file to read\n",
    "iowa_file_path = '../input/home-data-for-ml-course/train.csv'\n",
    "\n",
    "# Fill in the line below to read the file into a variable home_data\n",
    "home_data = pd.read_csv(iowa_file_path)\n",
    "\n",
    "# Call line below with no argument to check that you've loaded the data correctly\n",
    "step_1.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "c85573a5",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-02-19T13:31:02.463253Z",
     "iopub.status.busy": "2025-02-19T13:31:02.462902Z",
     "iopub.status.idle": "2025-02-19T13:31:02.473193Z",
     "shell.execute_reply": "2025-02-19T13:31:02.472343Z"
    },
    "papermill": {
     "duration": 0.01731,
     "end_time": "2025-02-19T13:31:02.475230",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.457920",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 2, \"questionType\": 1, \"questionId\": \"1_LoadHomeData\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#3366cc\">Hint:</span> Use the `pd.read_csv` function"
      ],
      "text/plain": [
       "Hint: Use the `pd.read_csv` function"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 3, \"questionType\": 1, \"questionId\": \"1_LoadHomeData\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc99\">Solution:</span> \n",
       "```python\n",
       "home_data = pd.read_csv(iowa_file_path)\n",
       "```"
      ],
      "text/plain": [
       "Solution: \n",
       "```python\n",
       "home_data = pd.read_csv(iowa_file_path)\n",
       "```"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Lines below will give you a hint or solution code\n",
    "step_1.hint()\n",
    "step_1.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ed4e0f52",
   "metadata": {
    "papermill": {
     "duration": 0.003557,
     "end_time": "2025-02-19T13:31:02.482951",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.479394",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## Step 2: Review The Data\n",
    "Use the command you learned to view summary statistics of the data. Then fill in variables to answer the following questions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "3e032cb6",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-02-19T13:31:02.492496Z",
     "iopub.status.busy": "2025-02-19T13:31:02.492144Z",
     "iopub.status.idle": "2025-02-19T13:31:02.605443Z",
     "shell.execute_reply": "2025-02-19T13:31:02.604215Z"
    },
    "papermill": {
     "duration": 0.120814,
     "end_time": "2025-02-19T13:31:02.607514",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.486700",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Id</th>\n",
       "      <th>MSSubClass</th>\n",
       "      <th>LotFrontage</th>\n",
       "      <th>LotArea</th>\n",
       "      <th>OverallQual</th>\n",
       "      <th>OverallCond</th>\n",
       "      <th>YearBuilt</th>\n",
       "      <th>YearRemodAdd</th>\n",
       "      <th>MasVnrArea</th>\n",
       "      <th>BsmtFinSF1</th>\n",
       "      <th>...</th>\n",
       "      <th>WoodDeckSF</th>\n",
       "      <th>OpenPorchSF</th>\n",
       "      <th>EnclosedPorch</th>\n",
       "      <th>3SsnPorch</th>\n",
       "      <th>ScreenPorch</th>\n",
       "      <th>PoolArea</th>\n",
       "      <th>MiscVal</th>\n",
       "      <th>MoSold</th>\n",
       "      <th>YrSold</th>\n",
       "      <th>SalePrice</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1201.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1452.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "      <td>1460.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>730.500000</td>\n",
       "      <td>56.897260</td>\n",
       "      <td>70.049958</td>\n",
       "      <td>10516.828082</td>\n",
       "      <td>6.099315</td>\n",
       "      <td>5.575342</td>\n",
       "      <td>1971.267808</td>\n",
       "      <td>1984.865753</td>\n",
       "      <td>103.685262</td>\n",
       "      <td>443.639726</td>\n",
       "      <td>...</td>\n",
       "      <td>94.244521</td>\n",
       "      <td>46.660274</td>\n",
       "      <td>21.954110</td>\n",
       "      <td>3.409589</td>\n",
       "      <td>15.060959</td>\n",
       "      <td>2.758904</td>\n",
       "      <td>43.489041</td>\n",
       "      <td>6.321918</td>\n",
       "      <td>2007.815753</td>\n",
       "      <td>180921.195890</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>421.610009</td>\n",
       "      <td>42.300571</td>\n",
       "      <td>24.284752</td>\n",
       "      <td>9981.264932</td>\n",
       "      <td>1.382997</td>\n",
       "      <td>1.112799</td>\n",
       "      <td>30.202904</td>\n",
       "      <td>20.645407</td>\n",
       "      <td>181.066207</td>\n",
       "      <td>456.098091</td>\n",
       "      <td>...</td>\n",
       "      <td>125.338794</td>\n",
       "      <td>66.256028</td>\n",
       "      <td>61.119149</td>\n",
       "      <td>29.317331</td>\n",
       "      <td>55.757415</td>\n",
       "      <td>40.177307</td>\n",
       "      <td>496.123024</td>\n",
       "      <td>2.703626</td>\n",
       "      <td>1.328095</td>\n",
       "      <td>79442.502883</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>21.000000</td>\n",
       "      <td>1300.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1872.000000</td>\n",
       "      <td>1950.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>2006.000000</td>\n",
       "      <td>34900.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>365.750000</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>59.000000</td>\n",
       "      <td>7553.500000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>1954.000000</td>\n",
       "      <td>1967.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>2007.000000</td>\n",
       "      <td>129975.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>730.500000</td>\n",
       "      <td>50.000000</td>\n",
       "      <td>69.000000</td>\n",
       "      <td>9478.500000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>1973.000000</td>\n",
       "      <td>1994.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>383.500000</td>\n",
       "      <td>...</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>25.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>2008.000000</td>\n",
       "      <td>163000.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>1095.250000</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>80.000000</td>\n",
       "      <td>11601.500000</td>\n",
       "      <td>7.000000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>2000.000000</td>\n",
       "      <td>2004.000000</td>\n",
       "      <td>166.000000</td>\n",
       "      <td>712.250000</td>\n",
       "      <td>...</td>\n",
       "      <td>168.000000</td>\n",
       "      <td>68.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>8.000000</td>\n",
       "      <td>2009.000000</td>\n",
       "      <td>214000.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>1460.000000</td>\n",
       "      <td>190.000000</td>\n",
       "      <td>313.000000</td>\n",
       "      <td>215245.000000</td>\n",
       "      <td>10.000000</td>\n",
       "      <td>9.000000</td>\n",
       "      <td>2010.000000</td>\n",
       "      <td>2010.000000</td>\n",
       "      <td>1600.000000</td>\n",
       "      <td>5644.000000</td>\n",
       "      <td>...</td>\n",
       "      <td>857.000000</td>\n",
       "      <td>547.000000</td>\n",
       "      <td>552.000000</td>\n",
       "      <td>508.000000</td>\n",
       "      <td>480.000000</td>\n",
       "      <td>738.000000</td>\n",
       "      <td>15500.000000</td>\n",
       "      <td>12.000000</td>\n",
       "      <td>2010.000000</td>\n",
       "      <td>755000.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>8 rows × 38 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                Id   MSSubClass  LotFrontage        LotArea  OverallQual  \\\n",
       "count  1460.000000  1460.000000  1201.000000    1460.000000  1460.000000   \n",
       "mean    730.500000    56.897260    70.049958   10516.828082     6.099315   \n",
       "std     421.610009    42.300571    24.284752    9981.264932     1.382997   \n",
       "min       1.000000    20.000000    21.000000    1300.000000     1.000000   \n",
       "25%     365.750000    20.000000    59.000000    7553.500000     5.000000   \n",
       "50%     730.500000    50.000000    69.000000    9478.500000     6.000000   \n",
       "75%    1095.250000    70.000000    80.000000   11601.500000     7.000000   \n",
       "max    1460.000000   190.000000   313.000000  215245.000000    10.000000   \n",
       "\n",
       "       OverallCond    YearBuilt  YearRemodAdd   MasVnrArea   BsmtFinSF1  ...  \\\n",
       "count  1460.000000  1460.000000   1460.000000  1452.000000  1460.000000  ...   \n",
       "mean      5.575342  1971.267808   1984.865753   103.685262   443.639726  ...   \n",
       "std       1.112799    30.202904     20.645407   181.066207   456.098091  ...   \n",
       "min       1.000000  1872.000000   1950.000000     0.000000     0.000000  ...   \n",
       "25%       5.000000  1954.000000   1967.000000     0.000000     0.000000  ...   \n",
       "50%       5.000000  1973.000000   1994.000000     0.000000   383.500000  ...   \n",
       "75%       6.000000  2000.000000   2004.000000   166.000000   712.250000  ...   \n",
       "max       9.000000  2010.000000   2010.000000  1600.000000  5644.000000  ...   \n",
       "\n",
       "        WoodDeckSF  OpenPorchSF  EnclosedPorch    3SsnPorch  ScreenPorch  \\\n",
       "count  1460.000000  1460.000000    1460.000000  1460.000000  1460.000000   \n",
       "mean     94.244521    46.660274      21.954110     3.409589    15.060959   \n",
       "std     125.338794    66.256028      61.119149    29.317331    55.757415   \n",
       "min       0.000000     0.000000       0.000000     0.000000     0.000000   \n",
       "25%       0.000000     0.000000       0.000000     0.000000     0.000000   \n",
       "50%       0.000000    25.000000       0.000000     0.000000     0.000000   \n",
       "75%     168.000000    68.000000       0.000000     0.000000     0.000000   \n",
       "max     857.000000   547.000000     552.000000   508.000000   480.000000   \n",
       "\n",
       "          PoolArea       MiscVal       MoSold       YrSold      SalePrice  \n",
       "count  1460.000000   1460.000000  1460.000000  1460.000000    1460.000000  \n",
       "mean      2.758904     43.489041     6.321918  2007.815753  180921.195890  \n",
       "std      40.177307    496.123024     2.703626     1.328095   79442.502883  \n",
       "min       0.000000      0.000000     1.000000  2006.000000   34900.000000  \n",
       "25%       0.000000      0.000000     5.000000  2007.000000  129975.000000  \n",
       "50%       0.000000      0.000000     6.000000  2008.000000  163000.000000  \n",
       "75%       0.000000      0.000000     8.000000  2009.000000  214000.000000  \n",
       "max     738.000000  15500.000000    12.000000  2010.000000  755000.000000  \n",
       "\n",
       "[8 rows x 38 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Print summary statistics in next line\n",
    "home_data.describe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "c00d7c8c",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-02-19T13:31:02.617975Z",
     "iopub.status.busy": "2025-02-19T13:31:02.617590Z",
     "iopub.status.idle": "2025-02-19T13:31:02.624937Z",
     "shell.execute_reply": "2025-02-19T13:31:02.623964Z"
    },
    "papermill": {
     "duration": 0.015026,
     "end_time": "2025-02-19T13:31:02.627300",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.612274",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.5, \"interactionType\": 1, \"questionType\": 1, \"questionId\": \"2_HomeDescription\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# What is the average lot size (rounded to nearest integer)?\n",
    "avg_lot_size = 10517\n",
    "\n",
    "import datetime\n",
    "# As of today, how old is the newest home (current year - the date in which it was built)\n",
    "newest_home_age = datetime.datetime.today().year - 2010\n",
    "\n",
    "# Checks your answers\n",
    "step_2.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "f6b2a386",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2025-02-19T13:31:02.638597Z",
     "iopub.status.busy": "2025-02-19T13:31:02.637895Z",
     "iopub.status.idle": "2025-02-19T13:31:02.649852Z",
     "shell.execute_reply": "2025-02-19T13:31:02.648542Z"
    },
    "papermill": {
     "duration": 0.019742,
     "end_time": "2025-02-19T13:31:02.651665",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.631923",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 2, \"questionType\": 1, \"questionId\": \"2_HomeDescription\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#3366cc\">Hint:</span> Run the describe command. Lot size is in the column called LotArea. Also look at YearBuilt. Remember to round lot size "
      ],
      "text/plain": [
       "Hint: Run the describe command. Lot size is in the column called LotArea. Also look at YearBuilt. Remember to round lot size "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 3, \"questionType\": 1, \"questionId\": \"2_HomeDescription\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc99\">Solution:</span> \n",
       "```python\n",
       "# using data read from home_data.describe()\n",
       "avg_lot_size = 10517\n",
       "newest_home_age = 15\n",
       "\n",
       "```"
      ],
      "text/plain": [
       "Solution: \n",
       "```python\n",
       "# using data read from home_data.describe()\n",
       "avg_lot_size = 10517\n",
       "newest_home_age = 15\n",
       "\n",
       "```"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "step_2.hint()\n",
    "step_2.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f78f949f",
   "metadata": {
    "papermill": {
     "duration": 0.006207,
     "end_time": "2025-02-19T13:31:02.663301",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.657094",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "## Think About Your Data\n",
    "\n",
    "The newest house in your data isn't that new.  A few potential explanations for this:\n",
    "1. They haven't built new houses where this data was collected.\n",
    "1. The data was collected a long time ago. Houses built after the data publication wouldn't show up.\n",
    "\n",
    "If the reason is explanation #1 above, does that affect your trust in the model you build with this data? What about if it is reason #2?\n",
    "\n",
    "How could you dig into the data to see which explanation is more plausible?\n",
    "\n",
    "Check out this **[discussion thread](https://www.kaggle.com/learn-forum/60581)** to see what others think or to add your ideas.\n",
    "\n",
    "# Keep Going\n",
    "\n",
    "You are ready for **[Your First Machine Learning Model](https://www.kaggle.com/dansbecker/your-first-machine-learning-model).**\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9eb330ce",
   "metadata": {
    "papermill": {
     "duration": 0.005115,
     "end_time": "2025-02-19T13:31:02.673505",
     "exception": false,
     "start_time": "2025-02-19T13:31:02.668390",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "---\n",
    "**[Machine Learning Course Home Page](https://www.kaggle.com/learn/machine-learning)**\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kaggle": {
   "accelerator": "none",
   "dataSources": [
    {
     "datasetId": 2709,
     "sourceId": 38454,
     "sourceType": "datasetVersion"
    },
    {
     "datasetId": 108980,
     "sourceId": 260251,
     "sourceType": "datasetVersion"
    }
   ],
   "isGpuEnabled": false,
   "isInternetEnabled": false,
   "language": "python",
   "sourceType": "notebook"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  },
  "papermill": {
   "default_parameters": {},
   "duration": 5.262345,
   "end_time": "2025-02-19T13:31:03.302837",
   "environment_variables": {},
   "exception": null,
   "input_path": "__notebook__.ipynb",
   "output_path": "__notebook__.ipynb",
   "parameters": {},
   "start_time": "2025-02-19T13:30:58.040492",
   "version": "2.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
